Ohcl tdx timer service support #107
base: product/hcl-main/6.12
1e0e3f7 to 4b8ffbb
This kernel now successfully boots the L2 Linux kernel, so I removed work-in-progress.
a6df45b to 8cce60e
TD partitioning provides a timer service that lets the L1 (VTL2) guest set a preemption timer for L2 (VTL0) vCPUs. Add members for the new timer service to the tdx_vp_context struct so that the L1 (VTL2) userspace can pass a timeout value down to the L1 (VTL2) kernel.
Signed-off-by: Isaku Yamahata <[email protected]>
Refactor the __tdcall() usage into a dedicated wrapper for the TDG.VP.WR() operation. This prepares for additional TDG.VP.WR() calls while avoiding repeated open-coding. No functional change intended.
Signed-off-by: Isaku Yamahata <[email protected]>
 *
 * TDX TDVPS deadline:
 * 0: immediate inject timer interrupt.
 * -1: disarmed.
Is this disarmed value present in the spec, or is it just an effect of setting an all-Fs TSC value?
It's in the public spec:
Intel® Trust Domain Extensions (Intel® TDX) Module
TD Partitioning Architecture Specification
354807-005US
September 2025
23.13.2. L2 VM TSC Deadline Support
Setting TSC_DEADLINE to -1 disables its operation.
I'll add this reference on the next update.
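The two special deadline values discussed in this thread can be illustrated with a small sketch. The macro, enum, and function names below are hypothetical and are not taken from the patch; only the meaning of 0 and -1 comes from the code comment and the spec citation above.

```c
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the patch. */
#define SKETCH_TSC_DEADLINE_IMMEDIATE 0ULL        /* 0: expires at once */
#define SKETCH_TSC_DEADLINE_DISARMED  UINT64_MAX  /* -1: timer disabled */

enum sketch_deadline_kind {
	SKETCH_DEADLINE_IMMEDIATE,
	SKETCH_DEADLINE_DISARMED,
	SKETCH_DEADLINE_ARMED,
};

/* Classify a raw 64-bit TSC deadline value as discussed in the thread. */
static enum sketch_deadline_kind sketch_classify_deadline(uint64_t deadline)
{
	if (deadline == SKETCH_TSC_DEADLINE_DISARMED)
		return SKETCH_DEADLINE_DISARMED;
	if (deadline == SKETCH_TSC_DEADLINE_IMMEDIATE)
		return SKETCH_DEADLINE_IMMEDIATE;
	return SKETCH_DEADLINE_ARMED;
}
```

Note that -1 stored in an unsigned 64-bit field is the all-Fs value the reviewer asks about, so both descriptions name the same bit pattern.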
drivers/hv/mshv_vtl_main.c
Outdated
};

/*
 * The L1 VMM needs to tell wake up time from HLT emulation because The host
nit: capitalization here
OK, will fix. Do you mean "The" => "the" after "because"? If not, please point out concretely which word to capitalize.
}
raw_local_irq_enable();
} else {
enum TDX_HALT_TIMER armed;
I think I'd want some other reviewers to chime in on how they want to manage this TDX-specific code here.
As this change would be intrusive, review from someone else would help.
drivers/hv/mshv_vtl_main.c
Outdated
struct mshv_vtl_per_cpu *per_cpu = this_cpu_ptr(&mshv_vtl_per_cpu);
u64 vm_idx = TDG_VP_ENTRY_VM_IDX(context->entry_rcx);

if (is_tdx_vm_idx_valid(vm_idx))
we use the prev value here because there wasn't an update call on this run? This handles the case when the timer was disarmed or disabled by the guest (because we set a large value of 0xFFs), is that right?
we use the prev value here because there wasn't an update call on this run?
Yes.
This handles the case when the timer was disarmed or disabled by the guest (because we set a large value of 0xFFs), is that right?
Yes. Several cases are covered. The scenarios are:
- The L2 (VTL0) guest updates the timer => userspace (openvmm) sets the deadline and update = 1, then runs the L2 vCPU. This can be arming or disarming, depending on the value.
- Optional: the kernel may run the L2 vCPU and return to L1 (VTL2) before the timer expires. In this case update = 0, and the value is remembered as the previous value.
- The L1 kernel goes to HLT emulation. There it tries to find the timer expiry value: the one in the context if update = 1, otherwise the remembered previous value.
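The selection rule described in these scenarios can be sketched as follows. This is a minimal illustration, not the patch's code: the struct layouts, field names, and function name are hypothetical, and the sketch assumes the remembered value starts out as disarmed (all Fs).

```c
#include <stdint.h>

/* Hypothetical mirror of what userspace passes down; not the real layout. */
struct sketch_timer_ctx {
	uint8_t  update;    /* 1 if userspace supplied a new deadline this run */
	uint64_t deadline;  /* meaningful only when update == 1 */
};

/* Hypothetical per-vCPU state kept by the kernel. */
struct sketch_vcpu_timer {
	uint64_t prev_deadline; /* last deadline received from userspace */
};

/*
 * Return the deadline the HLT emulation path should use: the value in the
 * context if update == 1, otherwise the remembered previous value.
 */
static uint64_t sketch_hlt_deadline(const struct sketch_timer_ctx *ctx,
				    struct sketch_vcpu_timer *st)
{
	if (ctx->update)
		st->prev_deadline = ctx->deadline;
	return st->prev_deadline;
}
```

With this shape, a run with update = 0 leaves the remembered value untouched, so a deadline armed several runs earlier is still visible when the vCPU later halts.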
Program the TD partitioning TSC deadline timer service for L2 (VTL0) vCPUs when the L1 (VTL2) userspace requests it. The TDX module then sets a preemption timer for the L2 vCPU. If the timer expires, the L2 (VTL0) vCPU exits with a VMX preemption timer exit reason. The mshv_vtl driver then exits to the userspace, and the userspace is notified of the exit.
The TDX module does not clear the TDVPS deadline on a preemption timer exit. Disarm the TSC deadline explicitly on the preemption timer exit. Otherwise, the following TDG.VP.ENTER() exits immediately without executing the L2 guest.
Signed-off-by: Isaku Yamahata <[email protected]>
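The disarm-on-expiry rule from this commit message might look roughly like the sketch below. The function name is hypothetical; the basic exit reason value 52 for the VMX preemption timer and the use of the low 16 bits as the basic exit reason follow the VMX definitions, and the all-Fs disarm value follows the review thread above.

```c
#include <stdint.h>

#define SKETCH_EXIT_REASON_PREEMPTION_TIMER 52u        /* VMX basic exit reason */
#define SKETCH_TSC_DEADLINE_DISARMED        UINT64_MAX /* -1: disarmed */

/*
 * Hypothetical helper: return the deadline that should be in effect before
 * the next entry. An expired deadline left in place would make the next
 * entry exit immediately, so it is replaced by the disarmed value.
 */
static uint64_t sketch_deadline_after_exit(uint32_t exit_reason,
					   uint64_t current_deadline)
{
	if ((exit_reason & 0xffffu) == SKETCH_EXIT_REASON_PREEMPTION_TIMER)
		return SKETCH_TSC_DEADLINE_DISARMED;
	return current_deadline;
}
```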
As the tdcall is slow, cache the previously written TSC deadline value and skip the unnecessary tdg.vp.wr(TSC deadline) if the value doesn't change. This also prepares for the HLT emulation case, which requires the previously written TSC deadline value.
Signed-off-by: Isaku Yamahata <[email protected]>
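A minimal sketch of the caching idea, under stated assumptions: the real TDG.VP.WR wrapper is replaced by a stand-in that only counts calls, all names are hypothetical, and the cache is assumed to be initialized to the value actually held by the TDX module (taken here to be the disarmed value).

```c
#include <stdint.h>

static unsigned int sketch_tdcall_count; /* number of simulated tdcalls */

/* Stand-in for the real TDG.VP.WR(TSC deadline) wrapper; it only counts. */
static void sketch_tdg_vp_wr_tsc_deadline(uint64_t deadline)
{
	(void)deadline;
	sketch_tdcall_count++;
}

/* Hypothetical per-vCPU cache of the last value written. */
struct sketch_deadline_cache {
	uint64_t last_written;
};

/* Issue the slow write only when the requested deadline actually changes. */
static void sketch_write_deadline(struct sketch_deadline_cache *cache,
				  uint64_t deadline)
{
	if (cache->last_written == deadline)
		return;
	sketch_tdg_vp_wr_tsc_deadline(deadline);
	cache->last_written = deadline;
}
```

The cache is only correct if every write goes through the same helper, since a write issued elsewhere would leave last_written stale.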
The TDX timer service sets a preemption timer for the L2 (VTL0) vCPU. tdg.vp.enter() exits with a preemption timer exit reason on timer expiry. The HLT emulation path, where the L1 (VTL2) kernel issues TDG.VP.VMCALL(HLT), needs an extra change because the host (L0) VMM doesn't know the L2 deadline timer value. When the L1 kernel issues TDG.VP.VMCALL(HLT), start a per-CPU hrtimer so that L1 gets a timer interrupt and wakes up from the L0 HLT emulation. Cancel the hrtimer after returning from the L0 VMM.
Signed-off-by: Isaku Yamahata <[email protected]>
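To arm a wakeup timer, the TSC deadline has to be turned into a relative delay. The commit message does not say how the patch does this; the following is only a sketch of one plausible conversion, assuming the TSC frequency is known in kHz and the deadline is an absolute TSC value. The function and parameter names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Convert an absolute TSC deadline into a delay in nanoseconds.
 * Returns false when no wakeup timer is needed (deadline disarmed) or the
 * frequency is unknown; otherwise stores the delay and returns true.
 */
static bool sketch_tsc_deadline_to_ns(uint64_t deadline, uint64_t now_tsc,
				      uint32_t tsc_khz, uint64_t *delay_ns)
{
	uint64_t delta, ms, rem;

	if (deadline == UINT64_MAX || tsc_khz == 0)
		return false;
	if (deadline <= now_tsc) {
		*delay_ns = 0; /* already expired: wake up immediately */
		return true;
	}
	delta = deadline - now_tsc;
	ms = delta / tsc_khz;  /* whole milliseconds */
	rem = delta % tsc_khz; /* leftover cycles, less than tsc_khz */
	if (ms > UINT64_MAX / 1000000ULL - 1) {
		*delay_ns = UINT64_MAX; /* saturate instead of overflowing */
		return true;
	}
	*delay_ns = ms * 1000000ULL + (rem * 1000000ULL) / tsc_khz;
	return true;
}
```

Splitting the division into whole milliseconds and a remainder keeps the intermediate products inside 64 bits without relying on 128-bit arithmetic.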
On the timer expiry path, the driver unconditionally issues tdg.vp.wr(TSC deadline = disarm). The following tdg.vp.enter() execution path may overwrite it with tdg.vp.wr(new TSC deadline). Delete the duplicated tdg.vp.wr() call as an optimization.
Signed-off-by: Isaku Yamahata <[email protected]>
Add an extension for the TDX timer service, so that the userspace can query the feature before use. Signed-off-by: Isaku Yamahata <[email protected]>
…akeup AP callback")
The commit df21bf3 ("arch/x86: Provide the CPU number in the wakeup AP callback") changed the signature of struct apic::wakeup_secondary_cpu(), but it did not update numachip_wakeup_secondary(). Update it to fix the compile error.
arch/x86/kernel/apic/apic_numachip.c:228:43: error: initialization of 'int (*)(u32, long unsigned int, unsigned int)' {aka 'int (*)(unsigned int, long unsigned int, unsigned int)'} from incompatible pointer type 'int (*)(u32, long unsigned int)' {aka 'int (*)(unsigned int, long unsigned int)'} [-Wincompatible-pointer-types]
228 | .wakeup_secondary_cpu = numachip_wakeup_secondary,
    | ^~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: df21bf3 ("arch/x86: Provide the CPU number in the wakeup AP callback")
Signed-off-by: Isaku Yamahata <[email protected]>
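The kind of fix this commit describes can be shown with a standalone sketch. The two function-pointer types come from the compiler error above; the struct name, the parameter names, and the function body are hypothetical stand-ins, not the kernel's code.

```c
#include <stdint.h>

typedef uint32_t u32;

/* Hypothetical stand-in for struct apic with the new three-argument hook. */
struct sketch_apic {
	int (*wakeup_secondary_cpu)(u32 apicid, unsigned long start_eip,
				    unsigned int cpu);
};

/*
 * The callback gains a third unsigned int parameter so that its type
 * matches the hook again. The extra argument is simply ignored here.
 */
static int sketch_numachip_wakeup_secondary(u32 apicid,
					    unsigned long start_eip,
					    unsigned int cpu)
{
	(void)apicid;
	(void)start_eip;
	(void)cpu;
	return 0;
}

static const struct sketch_apic sketch_apic_numachip = {
	.wakeup_secondary_cpu = sketch_numachip_wakeup_secondary,
};
```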
ac07f44 to 49cea0c
The TDX module supports a timer service for L1. When L1 writes the TSC deadline value with TDG.VP.WR(TSC DEADLINE), tdg.vp.enter() exits with a preemption timer exit when the deadline expires. (It does not inject a timer interrupt into the L2 guest.)
Update the TDX VP context shared between the L1 kernel and the L1 userspace so that openvmm can use the TDX timer service.
Unless the userspace uses it, the L1 OHCL kernel behaves the same as before.